Add the SqliteDosStorage storage backend #6148
Conversation
|
@sphuber this is awesome 👍. It would also be perfect if it was possible to use this for the test suite. Perhaps there could be some pytest switch that would allow changing the storage backend? It is also currently a bit annoying that the backend is loaded unconditionally, even if you run just a single test file that does not require it. Again, a pytest switch that disables the session fixture that loads the backend would be awesome (and I suspect would speed up iteration during development). |
giovannipizzi
left a comment
There was a problem hiding this comment.
Indeed, great! I left a small comment for a documentation string (maybe you can check if there are other similar cases in the code).
I fully support Daniel's comment - maybe open an issue and we look into it once this is merged?
I also opened aiidateam/aiida-tutorials#468 for the future, when we'll prepare the next tutorial
Finally, one important note. In the previous coding week, we've been discussing having a documentation page for the various backends, and have it properly linked in the docs so people see it when they setup AiiDA. And this was combined with a simpler verdi profile setup mechanism. I think @superstar54 , @unkcpz and/or @eimrek might have been working on this? Has this ever been merged? I recommend adding the documentation for these new backend as well. Either we open an issue now and plan to fix it soon, or if that documentation is already in the docs, we could fix it directly in this PR
There was a problem hiding this comment.
Shall this still be speaking explicitly of PostgreSQL? Maybe OK to keep but add "E.g., to avoid hitting..."
|
Thanks for the review @giovannipizzi
Regarding switching off the automatic loading of the backend in the session fixture: we had discussed this in the past, but decided against it since it is used by plugins and so changing it would be breaking, as they would have to start adding it manually. I agree that having it off by default would still be better, so maybe we can revisit whether this is an acceptable breaking change for plugins. As for changing the backend plugin: I already have the required scaffolding in place in the
I was not aware of this effort but agree that this would be very useful. This ties neatly into why we cannot merge this PR just yet. There is still some functionality that is not supported by this backend, mostly related to the querybuilder:
Since this PR was anyway dependent on a bunch of other PRs being merged first, I had put these final details on hold. But will try to find time soon to work on them. Update: It seems the part of JSON query functionality that is not supported includes at least the |
2155e14 to
aaadf59
Compare
@giovannipizzi I missed this comment, sorry. |
The PR of @superstar54 uses the |
|
|
026a253 to
758e00d
Compare
The implementation subclasses the `PsqlDosBackend` and replaces the
PostgreSQL database with an sqlite database. By doing so, the
initialization of the storage only requires a directory on the local
file system where it will create the sqlite file for the database and a
container for the disk-objectstore.
The advantage of this `sqlite_dos` storage over the default `psql_dos`
is that it doesn't require a system service like PostgreSQL. As a
result, creating the storage is very straightforward and can be done
with almost no setup. The advantage over the existing `sqlite_zip` is
that the `sqlite_dos` is not read-only but can be used to write data
as well.
Combined with the `verdi profile setup` command, a working profile can
be created with a single command:
verdi profile setup core.sqlite_dos -n --profile name --email e@mail
This makes this storage backend very useful for tutorials and demos
that don't rely on performance.
The `store` method of the `SqliteEntityOverride` class, used by the `SqliteZipBackend` storage backend (and with that all other backends to subclass this), did not return `self`. This is in conflict with the signature of the base class that it is overriding. Since the `SqliteZipBackend` is read-only and so `store` would never be called, this problem went unnoticed. However, with the addition of the `SqliteDosStorage` backend which is *not* read-only, this bug would surface when trying to store a node since certain methods rely on this method returning the node instance itself.
The storage backends that use sqlite instead of PostgreSQL, i.e.,
`core.sqlite_dos`, `core.sqlite_temp` and `core.sqlite_zip`, piggy back
of the ORM models defined by the `core.psql_dos` backend by dynamically
converting to the sqlite equivalent database models.
The current implementation of `SqlaGroup.count` would except when used
with an sqlite backend since certain columns would be ambiguously
defined:
sqlite3.OperationalError: ambiguous column name: db_dbgroup.id
This is fixed by explicitly wrapping the classes that are joined in
`sqlalchemy.orm.aliased` which will force sqlalchemy to properly alias
each class removing the ambiguity.
758e00d to
8087995
Compare
The implementation subclasses the
PsqlDosBackendand replaces thePostgreSQL database with an sqlite database. By doing so, the
initialization of the storage only requires a directory on the local
file system where it will create the sqlite file for the database and a
container for the disk-objectstore.
The advantage of this
sqlite_dosstorage over the defaultpsql_dosis that it doesn't require a system service like PostgreSQL. As a
result, creating the storage is very straightforward and can be done
with almost no setup. The advantage over the existing
sqlite_zipisthat the
sqlite_dosis not read-only but can be used to write dataas well.
Combined with the
verdi profile setupcommand, a working profile canbe created with a single command:
This makes this storage backend very useful for tutorials and demos
that don't rely on performance.